8 |
An Introduction to Complex Systems: Making Sense of a Changing World
|
|
|
|
In: Faculty Books (2019)
|
|
BASE
|
|
|
|
9 |
Should we use movie subtitles to study linguistic patterns of conversational speech? A study based on French, English and Taiwan Mandarin
|
|
|
|
In: Third International Symposium on Linguistic Patterns of Spontaneous Speech ; https://hal.archives-ouvertes.fr/hal-02385689 ; Third International Symposium on Linguistic Patterns of Spontaneous Speech, Nov 2019, Taipei, Taiwan ; http://lpss2019.ling.sinica.edu.tw/ (2019)
|
|
BASE
|
|
|
|
10 |
Segmentability Differences Between Child-Directed and Adult-Directed Speech: A Systematic Test With an Ecologically Valid Corpus
|
|
|
|
In: EISSN: 2470-2986 ; Open Mind ; https://hal.archives-ouvertes.fr/hal-02274050 ; Open Mind, MIT Press, 2019, 3, pp.13-22. ⟨10.1162/opmi_a_00022⟩ (2019)
|
|
BASE
|
|
|
|
11 |
A computational account of virtual travelers in the Montagovian generative lexicon
|
|
|
|
In: The Semantics of Dynamic Space in French ; https://hal.archives-ouvertes.fr/hal-02093536 ; Michel Aurnague; Dejan Stosic. The Semantics of Dynamic Space in French, John Benjamins, pp.407-450, 2019, Part IV. Formal and computational aspects of motion-based narrations, 9789027203205. ⟨10.1075/hcp.66.09lef⟩ ; https://benjamins.com/catalog/hcp.66.09lef (2019)
|
|
BASE
|
|
|
|
12 |
Towards TreeLex++: Syntactico-Semantic Lexical Resource for French
|
|
|
|
In: Language & Technology Conference ; https://hal.archives-ouvertes.fr/hal-02120183 ; Language & Technology Conference, May 2019, Poznan, Poland (2019)
|
|
BASE
|
|
|
|
13 |
On the integration of linguistic features into statistical and neural machine translation
|
|
|
|
In: Vanmassenhove, Eva Odette Jef orcid:0000-0003-1162-820X (2019) On the integration of linguistic features into statistical and neural machine translation. PhD thesis, Dublin City University. (2019)
|
|
BASE
|
|
|
|
14 |
Neural machine translation for multimodal interaction
|
|
Dutta Chowdhury, Koel. - : Dublin City University. School of Computing, 2019. : Dublin City University. ADAPT, 2019
|
|
In: Dutta Chowdhury, Koel (2019) Neural machine translation for multimodal interaction. Master of Science thesis, Dublin City University. (2019)
|
|
BASE
|
|
|
|
15 |
Cross-lingual parsing with polyglot training and multi-treebank learning: a Faroese case study
|
|
|
|
In: Barry, James orcid:0000-0003-3051-585X , Wagner, Joachim orcid:0000-0002-8290-3849 and Foster, Jennifer orcid:0000-0002-7789-4853 (2019) Cross-lingual parsing with polyglot training and multi-treebank learning: a Faroese case study. In: The 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), 3 - 5 Nov 2019, Hong Kong, China. ISBN 978-1-950737-78-9 (2019)
|
|
BASE
|
|
|
|
16 |
Selecting artificially-generated sentences for fine-tuning neural machine translation
|
|
|
|
In: Poncelas, Alberto orcid:0000-0002-5089-1687 and Way, Andy orcid:0000-0001-5736-5930 (2019) Selecting artificially-generated sentences for fine-tuning neural machine translation. In: 12th International Conference on Natural Language Generation, 29 Oct - 1 Nov 2019, Tokyo, Japan. (2019)
|
|
BASE
|
|
|
|
17 |
Automatic processing of code-mixed social media content
|
|
Barman, Utsab. - : Dublin City University. School of Computing, 2019. : Dublin City University. ADAPT, 2019
|
|
In: Barman, Utsab (2019) Automatic processing of code-mixed social media content. PhD thesis, Dublin City University. (2019)
|
|
Abstract:
Code-mixing, or language-mixing, is a linguistic phenomenon in which multiple languages mix together during conversation. Standard natural language processing (NLP) tools such as part-of-speech (POS) taggers and parsers perform poorly on such content because they are generally trained on monolingual content. Thus there is a need for code-mixed NLP. This research focuses on creating a code-mixed corpus in English-Hindi-Bengali and using it to develop a word-level language identifier and a POS tagger for such code-mixed content. The first target of this research is word-level language identification. A data set of romanised and code-mixed content written in English, Hindi and Bengali was created and annotated. Word-level language identification (LID) was performed on this data using dictionaries and machine learning techniques. We find that among a dictionary-based system, a character-n-gram based linear model, a character-n-gram based first order Conditional Random Fields (CRF) model and a recurrent neural network in the form of a Long Short Term Memory (LSTM) that considers words as well as characters, the LSTM outperformed the other methods. We also took part in the First Workshop on Computational Approaches to Code-Switching, EMNLP, 2014, where we achieved the highest token-level accuracy in the word-level language identification task for Nepali-English. The second target of this research is part-of-speech (POS) tagging. POS tagging methods for code-mixed data (e.g. pipeline and stacked systems and LSTM-based neural models) have been implemented; among them, the neural approach outperformed the other approaches. Further, we investigate building a joint model to perform language identification and POS tagging jointly. We compare a factorial CRF (FCRF) based joint model and three LSTM-based multi-task models for word-level language identification and POS tagging. The neural models achieve good accuracy in language identification and POS tagging, outperforming the FCRF approach. Furthermore, we found that it is better to go for a multi-task learning approach than to perform each individual task (e.g. language identification and POS tagging) using a neural approach. Comparison between the three neural approaches revealed that, without using task-specific recurrent layers, it is possible to achieve good accuracy by careful handling of the output layers for these two tasks, i.e. LID and POS tagging.
|
|
Keyword:
Computational linguistics; Machine learning
|
|
URL: http://doras.dcu.ie/22919/
|
|
BASE
|
|
|
|
18 |
Automatic error classification with multiple error labels
|
|
|
|
In: Popović, Maja orcid:0000-0001-8234-8745 and Vilar, David (2019) Automatic error classification with multiple error labels. In: MT Summit XVII, 19 - 23 Aug 2019, Dublin, Ireland. (2019)
|
|
BASE
|
|
|
|
19 |
Improving transductive data selection algorithms for machine translation
|
|
Poncelas, Alberto. - : Dublin City University. School of Computing, 2019. : Dublin City University. ADAPT, 2019
|
|
In: Poncelas, Alberto orcid:0000-0002-5089-1687 (2019) Improving transductive data selection algorithms for machine translation. PhD thesis, Dublin City University. (2019)
|
|
BASE
|
|
|
|
20 |
Combining SMT and NMT back-translated data for efficient NMT
|
|
|
|
In: Poncelas, Alberto orcid:0000-0002-5089-1687 , Popović, Maja orcid:0000-0001-8234-8745 , Shterionov, Dimitar orcid:0000-0001-6300-797X , Maillette de Buy Wenniger, Gideon and Way, Andy orcid:0000-0001-5736-5930 (2019) Combining SMT and NMT back-translated data for efficient NMT. In: Recent Advances in Natural Language Processing (RANLP 2019), 2-4 Sept 2019, Varna, Bulgaria. (2019)
|
|
BASE
|
|